A microprocessor.

A microprocessor for executing a fixed length
instruction in a program memory comprises a bus
interface unit (17) connected to an execution unit (2)
and an instruction control unit (3). The bus interface
includes an aligner (27) which sorts out each instruction
from an aggregate of one instruction and a part
of the next instruction contained in the 32-bit data
supplied through the data input/output path (20). The
bus interface makes the number of bits of the fixed
length instruction supplied to the instruction control
unit smaller than the number of bits of data concurrently
transferred to and from the execution unit (2).
A system for using the microprocessor improves the
utilization efficiency of memory that stores program.
Background of the Invention

Field of the Invention: 

The present invention relates to a microproces- 
sor and more particularly to an architecture of a 
RISC (reduced instruction set computer) type 
microprocessor, and provides a technique for more 
effectively controlling an apparatus in which the 
microprocessor is incorporated. 

Description of the Prior Art 

Numerous efforts have been made to enhance 
overall microprocessor system performance and to 
more easily enable system programs such as com- 
pilers and operating systems to be written, by 
upgrading the microprocessor functions and provid- 
ing as many instruction sets, addressing modes 
and data types as possible. This, however, has
resulted not only in increased complexity of the
microprocessor and a longer development period,
but also in a longer processing time taken by the
control circuit, thereby giving rise to the problem
of degraded microprocessor performance. Under
these circumstances, the RISC concept has been
proposed for solving this problem by simplifying
the instruction set. This
concept is based on the fact that the instructions 
that are most frequently executed, as close exami- 
nations revealed, are mostly simple instructions, 
such as LOAD and STORE. In order to improve 
microprocessor efficiency it is considered impor- 
tant to increase the speed of executing these sim- 
ple instructions. 

Conventional RISC type microprocessors em- 
ploy a fixed-length instruction format, say 32 bits 
long, as mentioned in the "CY7C600 Family Users' 
Guide" (Cypress Semiconductor) 1988, p 2-18 to p 
2-29. The instruction format described in this litera- 
ture can have up to 128 kinds of operation codes. 
However, 34 percent of these codes or 44 kinds of 
operation codes are not defined. 

Problems to be Solved by the Invention: 

Since a RISC processor bases itself on an 
architecture primarily intended to reduce the num- 
ber of instructions for high speed of instruction 
execution, it can safely be said of the RISC proces- 
sors in general that the number of undefined opera- 
tion codes in the instruction set tends to increase. 

The presence of many undefined operation
codes means that there are many virtually useless
bit strings in each operation code. This deteriorates
the code efficiency of object programs, or, in other
words, lowers the utilization efficiency of memory.
No provision has been made to compensate for
this inefficiency in the conventional microprocessors.
The code efficiency of the RISC processors is
generally said to be lower than 70 percent of that
of CISC (complex instruction set computer) type
processors. In the program memory, therefore, the
area that is virtually wasted becomes relatively
large, which, as the inventors have found, causes
the following problems. In applications where
on-board memories with limited capacity or on-chip
program memories on the processors are used,
such an inefficient utilization of memory will result
in insufficient memory or make it impossible to
upscale the circuit during system design.

The object of this invention is to provide a
microprocessor which improves the utilization
efficiency of memory that stores program.

This and other objects and novel features of
this invention will become apparent from the following
description taken in conjunction with the
accompanying drawings.

Brief Summary of the Invention 

One feature of the invention is that in a
microprocessor in which instructions are decoded and
processed by an instruction control unit having a
constant word length, the instruction word length is
set shorter than the maximum number of bits in
unit data, i.e., the maximum word length of data
that can be handled by an execution unit. In other
words, the instruction control unit decodes instructions
made up of a constant number of bits, which
are smaller in number than the maximum number
of bits contained in unit data that can be handled
by the execution unit.

Another feature of the invention is that for the
data transfer to and from the program memory and
data memory, a bus interface unit may be provided
to make the number of bits of each instruction
supplied to the instruction control unit smaller than
the number of bits of data transferred to and from
the execution unit.

Yet another feature is that to prevent the data
and instruction access control performed by the
instruction control unit from becoming complex, the
bus interface unit may be provided with an interface
path that is commonly used for input and
output of data and for input of instructions. It is
preferred from the standpoint of improved efficiency
of data transfer and of data and instruction
access that the interface path also have a width
large enough to transfer in parallel data consisting
of the maximum number of bits of unit data that
can be handled by the execution unit.

To make it possible to take in instructions by
using the entire width of the interface path, which is
wider than the instruction word length, the following
means may be added. That is, the instruction control
unit is provided with an instruction prefetch
queue which accumulates instructions to be given to
the instruction decoding means, and the bus interface
unit is provided with a means to sort out each
instruction from an aggregate of one instruction
and a part of another instruction contained in
information entered through the interface path.

To ensure flexibility of address arrangement on 
the memory used for storing data and instruction, 
the instruction control unit may manage data and 
instructions in the same address space. 

While the data memory and program memory
connected to the interface path may be located on
a chip separate from the microprocessor chip,
these memories are better arranged on the
microprocessor chip when high-speed access to them
is to be realized.

In an architecture like RISC, where the number
and kinds of operation codes are reduced to
increase the instruction execution speed, an
instruction word length shorter than the data word
length acts to reduce the number of virtually useless
bit strings in the operation code, as compared
with a case where the instruction and data word
lengths are set equal. This, in turn, enhances the
utilization of memory for storing programs.

Brief Description of the Drawings 

Fig. 1 is a block diagram of a microprocessor
formed in accordance with one embodiment of
this invention;

Fig. 2 is a block diagram showing an example
detail of the microprocessor of Fig. 1;

Fig. 3 is a logic circuit diagram showing one
example logic contained in the data bus interface
unit of Fig. 2 for sorting out bits from a 32-bit
input;

Fig. 4 is a schematic diagram showing the process
of sorting out and feeding the instructions
according to the bit sorting logic;

Fig. 5 is one example format of the instruction;

Fig. 6 is an address map showing the program
memory and the data memory located in a
common address space;

Fig. 7 is an address map showing the program
memory and the data memory located in separate
address spaces;

Fig. 8 is one example block diagram showing a
microprocessor that incorporates the data memory
and the program memory;

Fig. 9 is one example block diagram showing a
microprocessor with a dedicated data bus interface
unit and a dedicated program bus interface
unit;

Fig. 10a is a timing diagram showing the
instructions of Fig. 10b, sorted as in Fig. 4,
during their communication through the circuit
elements of Fig. 3;

Fig. 11 is a flowchart illustrating the processing
steps of the operation of the aligner of Fig. 3;

Fig. 12 is a detailed schematic diagram of an
alternative embodiment of an aligner as may be
employed in the circuit diagram of Fig. 3;

Fig. 13 is a timing diagram showing the
progress of an instruction set through the aligner
of Fig. 12; and

Fig. 14 is a schematic diagram of an aligner as
may be used in the circuit of Fig. 3.

Detailed Description of the Preferred Embodiments

Referring now to the drawings, wherein the
showings are for purposes of illustrating the preferred
embodiments of the invention only and not
for purposes of limiting same, the FIGURES show
a microprocessor using an RISC instruction set
wherein the instructions are communicated through
the processor with a lesser word length than a data
unit word length.

Fig. 1 shows a microprocessor formed in accordance
with one embodiment of this invention.

The microprocessor 1 of Fig. 1 has an RISC
type architecture, in which the instruction set is
comprised of a small number of frequently used
instructions with relatively short execution times,
such as LOAD and STORE, in order to increase the
overall execution speed of instructions. As to an
instruction requiring complex processing, a plurality
of relatively simple instructions contained in the
instruction set are combined to perform the processing
equivalent to the complex instruction. Although
the execution time of such an instruction
with a low frequency of use is long, the simple
instructions with relatively short execution times
account for a high proportion of all executed
instructions, so that the overall instruction
execution speed in the microprocessor as a whole
is increased.

The microprocessor 1 includes an execution
unit 2, an instruction control unit 3 to decode
instructions and control the execution unit 2, and a
bus interface unit 4 connected to the execution unit
2 and the instruction control unit 3 and also interfaced
with external circuits. These units are all
preferably formed in a single semiconductor silicon
chip.

The number of bits in each instruction decoded
and processed by the instruction control unit 3, i.e.,
the instruction word length, is set constant like
other general RISC type microprocessors.
Most importantly, though, in Fig. 1, the word
length of instructions processed by the instruction
control unit 3 is set smaller than the maximum
number of bits of unit data, i.e., the maximum word
length of data, that can be simultaneously transferred
to and from the execution unit 2. For example,
when the maximum word length of data is 32
bits, the instruction word length is set at 24 bits.
Other data/instruction word length combinations are
of course within the scope of the invention. The
bus interface unit 4 gives instructions to the
instruction control unit 3 through a 24-bit internal bus
5 and also transfers data to and from the execution
unit 2 through a 32-bit internal bus 6. The maximum
word length of unit data that can be handled
by the execution unit 2 is determined by the maximum
number of bits that can be entered into an
arithmetic logic unit and registers contained in the
execution unit 2 or by the width in bits of the
internal bus of the execution unit 2.

EP 0 472 025 A2

In Fig. 1, when the input and output of data and
the input of instructions are carried out through a
common external data bus 7, the external data bus
7 may use an arbitrary number of bits, such as 32
bits or 64 bits. Suppose the external data bus 7 is
64 bits wide. When information taken in 64 bits at a
time through the external data bus 7 is data, the
bus interface unit 4 divides the data into high-order
and low-order sections of 32 bits each and
transfers the data 32 bits at a time to the execution
unit 2. When the information entered 64 bits at a
time is an instruction, the instruction is divided into
24-bit sections and at the same time odd bits are
prefetched by the instruction control unit 3. The
logic necessary for these processings is contained
in the last 24-bit section containing the odd bits.
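
The division of a 64-bit fetch described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the choice to take the 24-bit sections starting from the low-order end is an assumption; the text does not state the order for the 64-bit case.

```python
MASK24 = (1 << 24) - 1
MASK32 = (1 << 32) - 1

def split_data(word64):
    """Split a 64-bit data word into (high-order, low-order) 32-bit sections."""
    return (word64 >> 32) & MASK32, word64 & MASK32

def split_instructions(word64):
    """Split a 64-bit fetch into two 24-bit sections plus 16 odd bits.

    Assumption: sections are taken from the low-order end first.
    64 = 2 * 24 + 16, so 16 bits are left over for the next instruction.
    """
    first = word64 & MASK24
    second = (word64 >> 24) & MASK24
    odd_bits = (word64 >> 48) & 0xFFFF
    return first, second, odd_bits
```

For a 64-bit word 0x0123456789ABCDEF, the data case yields the halves 0x01234567 and 0x89ABCDEF, while the instruction case yields two full 24-bit sections and a 16-bit remainder.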

Fig. 2 shows one detailed example of the 
microprocessor of Fig. 1. 

In Fig. 2, reference numeral 10 denotes a register
file, 11 an arithmetic logic unit, 12 an address
output register, 13 a data input register, 14 a data
output register, 15 an instruction prefetch queue,
16 an instruction decoder, 17 a data bus interface
unit and 18 an address bus interface unit.

The data bus interface unit 17 is connected to
the external data bus 7 of Fig. 1 through a 32-bit
data input/output path 20, which is commonly used
for input/output of data and for input of instructions,
and through a data input/output buffer (not shown)
connected with the path 20. In this example, the
external data bus 7 is set to 32 bits. The 32-bit
internal bus 6 used for data transfer between the
bus interface unit 4 and the execution unit 2 is
shown in Fig. 2 as a read bus 6R and a write bus
6W, each with a 32-bit bus width. The instructions
taken through the data input/output path 20 into the
data bus interface unit 17 are fed through a 24-bit
internal bus 5 to the instruction prefetch queue 15
where they are accumulated. The instructions that
are successively read out from the instruction
prefetch queue 15 are decoded by the instruction
decoder 16 to generate various control signals. The
addresses of the instructions to be prefetched are
held by a program counter allocated, for example,
in the register file 10. Other control logic for the
instruction execution sequence and interrupt control,
and for using internal calculation results for the
control, is not particularly shown in Fig. 2.

The register file 10 is used as an accumulator,
an address counter and a general-purpose register.
The arithmetic logic unit 11 may include, for instance,
an arithmetic and logic unit (ALU) for data
calculation and an arithmetic unit for address
calculation. The computed address is output to the
external circuit through the address output register
12. The execution unit 2 has 32-bit internal buses
21, 22, 23, through which data is transferred between
internal blocks.

Fig. 3 shows one example logic circuit diagram 
in the data bus interface unit 17 for distributing a 
32-bit input. 

In the data bus interface unit 17, the 32-bit
input/output path 20 is connected to the input of a
data selector 25 and to the input of a data latch
circuit 26.

The data selector 25 operates as follows when
the 32-bit data supplied from outside is to be
processed by the execution unit 2. According to an
output control signal 50 from the instruction decoder
16, the data selector 25 selects the input 32-bit
data, which is to be processed by the execution
unit 2, and sends the selected data to the data
input register 13.

Alternatively, the data selector 25, according to the
output control signal 50 from the instruction decoder
16, gives to the data input register 13 immediate
values of instructions prefetched by the instruction
prefetch queue 15. Further, the data selector
25 has a logic that swaps data between the
16 high-order bits and the 16 low-order bits of the
32-bit input data and allows the changed 32-bit
input data to be supplied to the data input register
13. It also has a logic that allows an immediate
value fed from the instruction prefetch queue 15 to
be supplied to the 16 high-order bits or 16 low-order
bits of the data input register 13. The control
of these logics is performed by an output control
signal 50 from the instruction decoder 16.
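
The two data selector operations just described, swapping the 16-bit halves and steering an immediate value into the high-order or low-order half, can be sketched as follows. This is only an illustration in Python; the function names and the boolean `high` parameter standing in for the control signal 50 are assumptions, not part of the described circuit.

```python
MASK16 = 0xFFFF

def swap_halves(word32):
    """Exchange the 16 high-order and 16 low-order bits of a 32-bit word."""
    return ((word32 & MASK16) << 16) | ((word32 >> 16) & MASK16)

def place_immediate(imm16, high):
    """Place a 16-bit immediate in the high or low half of a 32-bit word.

    `high` is a hypothetical stand-in for the decoder's control signal.
    The half that receives no data is left as zero bits.
    """
    imm16 &= MASK16
    return (imm16 << 16) if high else imm16
```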

The data latch circuit 26 holds 32-bit data and
outputs the data in parallel to an aligner 27. The
aligner 27 sorts out each instruction from an aggregate
of one instruction (24 bits) and a part of
the next instruction (8, 16 or 24 bits), contained in
the 32-bit data supplied through the data
input/output path 20, according to the data arrangement,
and distributes the sorted data to the instruction
prefetch queue 15. In this embodiment, the
prefetch queue 15 consists of a FIFO (first-in-first-out)
register having three memory stages of 24 bits
each. The sorting control on the aligner 27 and the
read/write control on the instruction prefetch queue
15 are performed by a control logic 28 that receives
an output control signal 52 from the instruction
decoder 16.

The control logic 28 determines if information
taken into the data bus interface unit 17 through
the data input/output path 20 is an instruction or
not, by checking the control signal output from the
instruction decoder 16 which represents an instruction
prefetch cycle. When the control signal indicates
the instruction prefetch cycle, the control
logic 28 enables the aligner 27 to control the data
alignment by sorting out each instruction from an
aggregate of instruction and parts thereof, taken
from the data path 20. In synchronism with this
processing, the control logic 28 controls the
read/write operation on the instruction prefetch
queue 15. For example, when instruction data
IDATA1 (32 bits) of Fig. 4 is taken from the latch
circuit 26 into the aligner 27, the aligner 27 first
separates and outputs an instruction INST1 made
up of 24 low-order bits. The instruction prefetch
queue 15 receives the instruction INST1 and stores
it in the memory stage 15A. Then, the aligner 27
picks up from latch 26 and outputs a part of the
next instruction INST2 contained in the 8 high-order
bits of the instruction data IDATA1. The instruction
prefetch queue 15 receives the part of the
next instruction INST2 and then stores it in the
memory stage 15B. Next, when the instruction data
IDATA2 (32 bits) is taken in, the aligner 27 first
picks up and outputs the remaining 16 bits of the
second instruction INST2 contained in the 16 low-order
bits of the instruction data. The instruction
prefetch queue 15 receives the remaining 16 bits
of the instruction INST2 and stores them in the
memory stage 15B. Then, the aligner 27 picks up and
outputs a part (16 bits) of the next instruction
INST3 contained in the 16 high-order bits of the
instruction data IDATA2. The instruction prefetch
queue 15 receives the part of the third instruction
INST3 and stores it in the memory stage 15C. As
shown in Fig. 4, there is a certain regularity to the
manner in which the instructions are separated and
fed from the 32-bit instruction data. So, the control
logic 28 performs the data sorting control according
to this regularity. In this way, the instructions
contained in the 32-bit instruction data are successively
sorted and prefetched into the instruction
prefetch queue 15. These prefetched instructions
are all 24 bits long so that the instruction decoder
can decode the instructions.
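
The regularity noted above (24 + 8, then 16 + 16, then 8 + 24 bits) can be modelled in software as a bit accumulator. The following Python sketch is a behavioural illustration only, not the circuit of Fig. 3; the function name and parameters are hypothetical, and it assumes, as in the IDATA1 to IDATA3 example, that instructions are packed starting from the low-order bits of each 32-bit word.

```python
def align_instructions(words, inst_bits=24, word_bits=32):
    """Yield fixed-length instructions packed low-order-first in fetched words."""
    acc = 0      # bits received but not yet emitted as a whole instruction
    nbits = 0    # number of valid bits currently held in acc
    inst_mask = (1 << inst_bits) - 1
    word_mask = (1 << word_bits) - 1
    for word in words:
        # Append the new word above the leftover bits of the previous one.
        acc |= (word & word_mask) << nbits
        nbits += word_bits
        # Emit every complete instruction now available.
        while nbits >= inst_bits:
            yield acc & inst_mask
            acc >>= inst_bits
            nbits -= inst_bits

# Three 32-bit fetches carrying four 24-bit instructions.
idata = [0x06010203, 0x08090405, 0x0A0B0C07]
instructions = list(align_instructions(idata))
```

Here the first word carries INST1 (0x010203) and the low 8 bits of INST2, the second carries the high 16 bits of INST2 and the low 16 bits of INST3, and the third carries the high 8 bits of INST3 and all of INST4, so three fetches deliver four instructions.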

Figs. 10 and 11 provide a more detailed illustration
of the selective alignment of the 32-bit
word received in the data latch circuit 26 (Fig. 3) so
that the instruction may be properly communicated
to the prefetch queue 15 with only a smaller, 24-bit
bus communication. More particularly, the aligner
27 will at least comprise an input register, a shifter
(Fig. 14), and an output register in order to properly
communicate an instruction from the latch circuit
26 to the prefetch queue 15. With particular reference
to Fig. 11, it can be seen that when the data
is latched into the aligner input register the control
logic 28 identifies whether it is of the form of
IDATA1, IDATA2, or IDATA3 (note Figs. 4 and
10b). When it is of the form of IDATA1, the lowermost
24 bits will be a single instruction INST1, while the
upper 8 bits will be a portion of the next instruction
INST2 (2A). When it is of the form of IDATA2, the
lowermost 16 bits will be the higher 16 bits of the
second instruction INST2 (2B) and therefore must
be combined with the uppermost 8 bits (2A) from
IDATA1. The uppermost 16 bits of IDATA2 comprise
the 16 lower bits of the third instruction
INST3 (3A), and data of the form IDATA3
will have as its lowermost 8 bits the 8 higher bits
of INST3 (3B), while the uppermost 24 bits will
comprise a complete fourth instruction, INST4 (4).
Since, in the particular example
being discussed, all of the instruction data will be of
the three forms IDATA1, IDATA2, IDATA3, they
can be aligned with repetitive regularity after
determination of what type of data form the
instruction is in.

Continuing with reference to Fig. 11, when the
instruction is of the form IDATA1, the 24 least
significant bits are latched into the aligner output
register and then the contents of the aligner input
register are rotated by a barrel shifter (Fig. 14) 8
bits to the right and the 8 least significant bits are
then latched into the aligner output register. As
noted above, since the queue is a first-in-first-out
prefetch queue, the 24 least significant bits of
IDATA1 would be latched into the memory stage
15A. After the rotation of the contents of the input
register by the barrel shifter so that the 8 least
significant bits of INST2 are rotated and stored
into the least significant bits of the aligner output
register, these 8 bits can then be latched into the
least significant bits of the memory stage 15B.
When the data is of the form IDATA2, the contents
of the aligner input register are rotated right 8 bits
and the middle 16 bits of the aligner input register
are then latched into the 16 most significant bits of
the aligner output register. One should bear in mind
that the 8 most significant bits of the shifter need
not be connected to the aligner output register,
since the output register will only communicate 24
bits to the prefetch queue. Accordingly, upon
completion of the first latch of IDATA2 into the aligner
output register, INST2 will be completely
received into the memory stage 15B of the queue
15. Since the upper 16 bits of IDATA2 comprise
the lower 16 bits of INST3, the second rotation of
the contents of IDATA2 occurs by rotating
them right 16 bits so that the 16 least significant
bits of the aligner input register can be latched into
the aligner output register. After completion of this
stage, the aligner output register can then be
latched into the 16 least significant bits of the
memory stage 15C so that the 16 least significant
bits of INST3 are properly stored in the memory
stage 15C. Where the instruction data is of the
form IDATA3, the 8 least significant bits of IDATA3
comprise the 8 most significant bits of INST3. By
rotating the aligner input register contents 16 bits
to the right, these bits can then be latched into the
aligner output register. With reference to Fig. 11,
these bits are referred to in the figure as the 8
upper-lower bits, which are intended to comprise
bits 16-23 after rotation. After completion of this
latching, it can be appreciated that memory stages
15A, 15B, 15C will respectively include the three
instructions INST1, INST2, INST3. INST1 can then
be communicated from the memory stage 15A to
the instruction decoder 16, and memory stage 15C
will then become available by moving
up INST2 and INST3 into memory stages 15A and
15B, respectively, as would occur in a FIFO queue.
INST4 can be latched into the memory stage 15C
subsequent to this movement by rotating the contents
of the aligner input register (AIR) left 8 bits so
the 24 least significant bits of
the AIR comprise INST4, which are then latched
into the aligner output register. Accordingly, INST4
would then be placed in the memory stage 15C.

A timing diagram particularly illustrating the 
timing results of this processing is shown in Fig. 
10a. The timing diagram is somewhat inconsistent
with the above explanation, which was made for
purposes of clarity of illustration, in that the timing 
diagram of Fig. 10a shows a continuous shifting of 
the instructions from the prefetch queue in 
cooperation with the continuous latching of instruc- 
tions from the aligner output register. Fig. 10a is a 
more accurate representation of the actual timing 
and disposition of the queue memory stages in a 
continuous operation. For purposes of simplicity, 
INST1 is identified in Fig. 10b as merely 1, while 
INST2 comprises 2A, 2B, INST3 comprises 3A, 3B, 
etc. 

Fig. 12 shows one alternative example block
diagram of the aligner 27. The aligner 27 comprises
a two-stage data latch (latch A, latch B) and
multiplexers (MUX1, MUX2, MUX3). The latch A
and the latch B are 32-bit registers and their outputs
A2, A3, A4, B1, B2, and B3 are connected to the
multiplexers MUX1, MUX2, and MUX3 as shown in
Fig. 12A. Outputs of the multiplexers M1, M2, M3 are
controlled by the control logic 28 as shown in Fig.
12B.

For example, first, when instruction data
IDATA1 of Fig. 4 is moved to the latch B from the
data latch circuit 26, the multiplexers select B1, B2,
and B3, and the aligner 27 separates and outputs
the instruction INST1 contained in the least significant
24 bits of the instruction data IDATA1. The
instruction prefetch queue 15 receives the instruction
INST1 and stores it in the memory stage 15A.

Second, when the instruction data IDATA1 is
moved to the latch A and the instruction data
IDATA2 is moved to the latch B, the multiplexers
select A4, B1, and B2, and the aligner 27
separates and outputs the instruction INST2 contained
in the most significant 8 bits of the instruction
data IDATA1 and the least significant 16 bits of
the instruction data IDATA2. The instruction
prefetch queue 15 receives the instruction INST2
and stores it in the memory stage 15B.

Third, when the instruction data IDATA2 is
moved to the latch A and the instruction data
IDATA3 is moved to latch B, the multiplexers select
A3, A4, and B1, and the aligner 27 separates and
outputs the instruction INST3 contained in the most
significant 16 bits of the instruction data IDATA2
and in the least significant 8 bits of the instruction
data IDATA3. The instruction prefetch queue 15
receives the instruction INST3 and stores it in the
memory stage 15C.

Next, when the instruction data IDATA3 is
moved to latch A, the multiplexers select A2, A3,
and A4, and the aligner separates and outputs the
instruction INST4 contained in the most significant
24 bits of instruction data IDATA3. The instruction
prefetch queue 15 receives the instruction INST4
and stores it in the memory stage 15C.

A timing diagram of the above aligner operation
is shown in Fig. 13. For purposes of simplicity,
INST1 is identified in Fig. 13 as merely 1, INST2 is
2, INST3 is 3, etc.
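
The four multiplexer selections of the two-latch aligner can be checked with a short behavioural sketch in Python. It is an illustration under stated assumptions, not the circuit of Fig. 12: byte lanes are numbered 1 (least significant) to 4 (most significant) in each latch, the first lane listed in each selection is taken as the low-order byte of the output, and a latch that is not selected in a given step is modelled as holding zero. All names are hypothetical.

```python
def byte_lanes(word32):
    """Return byte lanes 1..4 of a 32-bit word (lane 1 = least significant)."""
    return {i + 1: (word32 >> (8 * i)) & 0xFF for i in range(4)}

# Multiplexer selections for the four steps, low-order byte first.
SELECTS = [
    (("B", 1), ("B", 2), ("B", 3)),   # INST1
    (("A", 4), ("B", 1), ("B", 2)),   # INST2
    (("A", 3), ("A", 4), ("B", 1)),   # INST3
    (("A", 2), ("A", 3), ("A", 4)),   # INST4
]

def mux_align(idata1, idata2, idata3):
    """Return the four 24-bit instructions selected from three 32-bit words."""
    # (latch A, latch B) contents at each step; unselected latches are 0.
    latches = [(0, idata1), (idata1, idata2), (idata2, idata3), (idata3, 0)]
    out = []
    for (a, b), select in zip(latches, SELECTS):
        lanes = {"A": byte_lanes(a), "B": byte_lanes(b)}
        inst = 0
        for pos, (latch, lane) in enumerate(select):
            inst |= lanes[latch][lane] << (8 * pos)
        out.append(inst)
    return out
```

With the words 0x06010203, 0x08090405 and 0x0A0B0C07, the four selections recover the instructions 0x010203, 0x040506, 0x070809 and 0x0A0B0C in order, which matches the bit assignments given for IDATA1 to IDATA3.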

Fig. 5 shows a format of the instruction executed
by the microprocessor 1 of this embodiment.

The instruction consists of an operation code
specification field 30 describing the kind of operation
to be performed and another specification field
31 representing an operand itself for the operation
specified by the operation code or information
indicating the location of the operand. The instruction
is a fixed length of 24 bits. For example, an
operand specified by the location information
(including information on the addressing mode
described later) contained in the specification field 31
is given through the data selector 25 to the data
input register 13. The maximum word length of unit
data worked upon by a single instruction depends
on the maximum number of bits that can be entered
into the arithmetic logic unit and registers
contained in the execution unit or on the number
of bits of the internal bus in the execution unit.
In this embodiment, the maximum data word length
is 32 bits. The word length of data handled by the
microprocessor of this embodiment is not necessarily
the maximum word length. For instance, it
may be 16 bits long. In that case, the data length
need only be expanded from 16 bits to 32 bits.

The operation code specification field 30 may 
include, in addition to an operation code such as 
LOAD and STORE, an addressing mode and in- 
formation on branch conditions for a status flag and 
a carry generated during computation. 

The second specification field 31 may include, 
as required, a register specification, immediate val- 
ue specification, and displacement. What kind of 
information is contained in this field 31 is identified 
by the information contained in the operation code 
specification field 30. The information for register 
specification specifies a register number contained 
in the execution unit 2; the immediate value di- 
rectly specifies a value such as data and address; 
and the displacement includes information used for 
branch destination address calculation in connec- 
tion with JUMP and CALL instructions. 

The number of bits of the operation code con- 
tained in the instruction may or may not be fixed. 
In this embodiment, the number of bits in the 
operation code specification field 30 is fixed to 8 
bits. The instruction decoder 16 in this embodiment 
accepts all bits of the instruction output from the 
instruction prefetch queue 15. Whether the informa- 
tion in the specification field 31 of the instruction 
received should be decoded or not is determined 
by a bit contained therein which specifies the
decoding. For instance, if an immediate value is
contained in the field 31, the information is not
decoded and the immediate value is fed to the
execution unit 2.
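
With an 8-bit operation code specification field 30 in a 24-bit instruction, 16 bits remain for the specification field 31. A minimal Python sketch of separating the two fields follows. The placement of field 30 in the high-order 8 bits is an assumption made for illustration, since the bit positions of Fig. 5 are not given in the text, and the function name is hypothetical.

```python
def decode_fields(inst24):
    """Split a 24-bit instruction into (field 30, field 31).

    Assumption: the 8-bit operation code field occupies the high-order
    bits and the 16-bit second field occupies the low-order bits.
    """
    opcode_field = (inst24 >> 16) & 0xFF
    second_field = inst24 & 0xFFFF
    return opcode_field, second_field
```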

In the microprocessor 1 of this embodiment in 
which the instruction word length is shorter than 
the data word length, there may be a case where 
the immediate value that is also used directly as an 
operand requires the same number of bits as the 
data word length. To cope with this situation, the 
immediate value equal in length to the data can be 
obtained from two instructions. For example, in the 
24-bit instruction format with 8 bits assigned for the 
operation code specification field 30, a set-high 
instruction and a set-low instruction are prepared 
which specify whether the 16-bit immediate value 
shall be fed to high-order 16 bits or low-order 16 
bits of the data input register 13. By executing 
such instructions successively, it is possible to set 
the 32-bit immediate value in a desired register. 
That is, depending on the decoded result of the
instruction, it is controlled whether to store the
immediate value of the instruction in the high-order
16 bits or low-order 16 bits of the data input
register 13. The 16 bits where the data is not
stored are set to logic 0. In the next instruction
cycle, the 32-bit data first stored in the data
input register 13 is transferred to the arithmetic
logic unit 11. At the same time, the second half of
the immediate value is written into the opposite 16-bit
half of the data input register 13, with 0 bits
set in the remaining 16-bit half. This data is similarly
output to the arithmetic logic unit 11, which
combines the two data to recover the entire 32-bit
data and then transfers it to a specified register.
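
The set-high/set-low sequence can be sketched as follows. The function names are hypothetical, and the use of a bitwise OR to combine the two halves is an assumption; the text states only that the arithmetic logic unit 11 combines the two data.

```python
MASK16 = 0xFFFF


def data_input_register(imm16, high):
    """Return the 32-bit value held in data input register 13 for one
    set-high or set-low instruction: the 16-bit immediate value in the
    selected half and logic 0 in the other half."""
    imm16 &= MASK16
    return (imm16 << 16) if high else imm16


def load_immediate32(high_imm, low_imm):
    """Model two successive instructions building a 32-bit immediate."""
    first = data_input_register(high_imm, high=True)    # set-high instruction
    second = data_input_register(low_imm, high=False)   # set-low instruction
    # Assumption: the ALU merges the two zero-padded halves with an OR.
    return first | second
```

Under these assumptions, `load_immediate32(0x1234, 0x5678)` gives `0x12345678`. Because the unused half of each register value is zero, an addition would recover the same result.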

Fig. 6 shows an address map in which a program
memory 35 and a data memory 36 are located in the
same address space and are each assigned different
addresses. In the address space managed by the
microprocessor of this embodiment, the data memory 36
and the program memory 35 are arranged in the same
space, as shown in Fig. 6. On the other hand, Fig. 7
shows an address map in which a program memory 37
and a data memory 38 are located in separate address
spaces. In the address map of Fig. 7, instructions
are given to an instruction control unit 40 through a
dedicated path 39 and data is given to an execution
unit 42 through a dedicated path 41. Hence, with the
microprocessor with the address map of Fig. 7, it is
virtually impossible to swap the locations between
the program memory and the data memory. This kind of
address map is applied to an architecture of a
dedicated processor such as a digital signal
processor with a top priority given to the increased
speed of operation, in which instruction fetch and
data transfer are carried out in parallel. In the
architecture in which the program memory 35 and the
data memory 36 are managed by the memory map of
Fig. 6, however, there is provided an excellent
flexibility in address arrangement of memory
locations storing data and instructions. This kind of
architecture is suited for multi-purpose
microprocessors that are controlled by an operation
program that may vary according to characteristics
and functions of a controlled apparatus.

Fig. 8 shows a microprocessor with a configuration
similar to that of a single chip microcomputer, in
which a data memory 44 containing data to be
processed by the execution unit 2 and a program
memory 45 containing instructions to be decoded by
the instruction control unit 3 are arranged in the
same semiconductor chip. In this example, the data
input/output terminal of the data memory 44 and the
data output terminal of the program memory 45 have a
common path equivalent to the transfer path 20, and
their address signal input terminals are supplied
with address signals through a common address bus.
The address map of the data memory 44 and the program
memory 45 corresponds to that of Fig. 6.

Fig. 9 shows the configuration of another example
of microprocessor, which has a bus interface unit 46
dedicated for instructions and a bus interface
unit 47 dedicated for data. The instruction bus
interface unit 46 is connected at one end with a
data output terminal of the program memory, and
the data bus interface unit 47 is connected at one
end with the data memory. Addresses are fed to
these interface units through a common address
bus. In this dedicated configuration, the aligner 27
and the associated control logic 28, which are
necessary when the common interface path 20 is used,
can be eliminated. On the other hand, the
microprocessor as a semiconductor integrated circuit
in this example has an increased number of external
terminals.

The above embodiments offer the following 
advantages. 

(1) In microprocessors with an architecture
designed to increase the instruction execution
speed by minimizing the number of kinds of
operation codes, like a RISC type microprocessor,
the word length of instructions decoded and
processed by the instruction control unit is
fixed to a constant length which is shorter than
the maximum number of bits of unit data or the
maximum word length of data that can be handled
by the processor unit. This reduces the
number of virtually useless bit strings in the
operation codes, as compared with a case
where the instruction word length is set equal to
the data word length. As a result, the utilization
of memory that stores programs is improved.

(2) In a system containing the RISC type micro- 
processors, the area in the program memory 
which was previously virtually wasted becomes 
relatively small because of the above-mentioned 
architecture. In applications that use limited ca- 
pacity on-board memory or on-chip program 
memory, it is thus possible to avoid problems, 
which one may encounter during system design, 
of ending up with insufficient capacity of pro- 
gram memory or having to up-scale the memory 
circuit. 

(3) As a means to make the instruction word
length shorter than the data word length, the bus
interface unit 4 is provided which, for optimum
use of the program memory and data memory,
makes the number of bits of instructions
supplied to the instruction control unit 3 smaller than
the number of bits of data transferred to and
from the execution unit 2. Because of this, the
number of bits transferred to and from external
circuits or data and program memories can be
determined arbitrarily without being restricted by
the data word length or instruction word length.

(4) The provision in the bus interface unit 4 of
the interface path 20 used both for input/output
of data and for input of instructions prevents the
access control on the data and instructions
performed by the instruction control unit 3 from
becoming complex.

(5) The interface path 20 is given a sufficient
width to transfer in parallel the maximum number
of bits of unit data that can be handled by the
execution unit 2. This prevents degradation of
data transfer efficiency or data and instruction
access efficiency.

(6) The instruction control unit 3 is provided with
the instruction prefetch queue 15 for accumulating
the instructions to be given to the instruction
decoder 16. The bus interface unit 4 is provided
with the aligner 27 and the control logic 28 that
together can sort out each instruction from the
aggregate of one instruction and a part of another
instruction contained in the information
supplied through the interface path 20, and
which distribute the sorted-out instructions to the
prefetch queue 15. This arrangement permits
efficient input of instructions by utilizing the
entire width of the interface path 20 when the
interface path 20 is wider than the instruction
word length.

(7) Since the instruction control unit 3 manages
the data and instructions in the same address
space, the flexibility of address arrangement in
storing data and instructions in memory can be
assured.

(8) The data memory and program memory
connected to the interface path 20 are provided
on the microprocessor chip, so that high-speed
access to these memories is achieved.
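
The sorting-out of instructions mentioned in advantage (6) can be sketched in software as follows. This is only a behavioral illustration of what the aligner 27 and control logic 28 accomplish, not a description of the circuit. It assumes that 24-bit instructions are packed back to back into 32-bit words with the first instruction in the most significant bits; the packing order and the name `align` are assumptions.

```python
def align(words, data_bits=32, instr_bits=24):
    """Extract fixed-length instructions from a stream of wider words.

    Each word may hold one whole instruction plus part of the next,
    so leftover bits are carried over to the following word.
    """
    buffer = 0    # bits received but not yet emitted as an instruction
    count = 0     # number of valid bits currently held in buffer
    queue = []    # stands in for the instruction prefetch queue 15
    for word in words:
        buffer = (buffer << data_bits) | (word & ((1 << data_bits) - 1))
        count += data_bits
        while count >= instr_bits:
            count -= instr_bits
            queue.append((buffer >> count) & ((1 << instr_bits) - 1))
        buffer &= (1 << count) - 1    # keep only the partial instruction
    return queue
```

With this packing, four 24-bit instructions occupy exactly three 32-bit words (96 bits). Storing the same four instructions one per 32-bit word would take four words, which is the memory saving referred to in advantages (1) and (2).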

In the foregoing, the invention has been described
in conjunction with the preferred embodiments.
It should be noted, however, that the invention
is not limited to the above embodiments alone
and that various modifications can be made without
departing from the spirit and scope of the invention.

For example, while in the above embodiments
the instruction words are shown to be 24 bits long
and the maximum data words 32 bits long, they
may have other appropriate lengths depending on
the processing capability of the microprocessor
and the configuration of the system to be
controlled.

In the preceding embodiments, information of
the instruction contained in other than the operation
code specification field is also decoded as
required. It is possible to decode only the information
contained in the operation code specification field.
Furthermore, the bus interface unit is not limited to
a circuit that interfaces with external circuits. In a
one-chip microprocessor with a built-in program
memory and data memory, the bus interface unit
may be a circuit that interfaces with the internal
bus commonly connected to these memories.

Claims

1. A microprocessor for executing a fixed length
instruction in a program memory (45) and pro- 
cessing data in a data memory (44), compris- 
ing 

an execution unit (2), 

an instruction control unit (3) including in- 
struction decoding means (16) for decoding 
the fixed length instruction and means for con- 
trolling said execution unit according to the 
instruction and 

a bus interface unit (4) connected both to 
said execution unit (2) and said instruction 
control unit (3) and being adapted to make the 
number of bits of the fixed length instruction 
supplied to said instruction control unit (3) 
smaller than the number of bits of data si- 
multaneously transferred to and from said ex- 
ecution unit (2) 

wherein at least said microprocessor is in 
a single semiconductor integrated circuit for- 
mat. 

2. The microprocessor of claim 1, wherein said
bus interface unit (4) includes an interface path 
(20) used commonly for input and output of the 
data and for input of the instruction. 

3. The microprocessor of claim 2, wherein said
interface path (20) has a width to parallelly
transfer, to and from the data memory (44), the
same number of bits of the data as the
number of bits of the data simultaneously
transferred to and from said execution unit (2).

4. The microprocessor of claim 3, wherein said 
instruction control unit (3) includes an instruc- 
tion prefetch queue (15) that accumulates the 
instruction to be given to said instruction decoding
means (16), and said bus interface unit
(4) includes means (25) for sorting out each 
instruction from an aggregate of one instruction 
and a part of another instruction contained in 
information supplied through said interface 
path and means for distributing said instruction 
sorted by said means to said instruction 
prefetch queue (15). 

5. The microprocessor of any of claims 2 to 4, 
wherein said data memory (44) and said pro- 
gram memory (45) are both connected to said 
interface path (20). 

6. The microprocessor of any of claims 2 to 5, 
wherein said instruction control unit (3) manages
the instruction and the data in the same
address space. 

7. The microprocessor of any of claims 1 to 6,
wherein said execution unit (2) has a data input
register (13), an ALU (11), a register file (10), a
data output register (14), an address output
register (12) and a plurality of internal buses
(21...23), and said bus interface unit (4) has a
data bus interface (17) and an address bus
interface (18).

8. The microprocessor of any of claims 1 to 6,
wherein said bus interface unit (4) includes a
first interface (47) used for input and output of
the data and a second interface (48) used for
input of the instruction.

9. The microprocessor of claim 8, wherein said
execution unit (2) has a data input register
(13), an ALU (11), a register file (10), a data
output register (14), an address output register
(12) and a plurality of internal buses (21...23).

10. A microprocessor for execution of an instruction
in a program memory (45) and processing
data in a data memory (44), comprising

an execution unit (2), and

an instruction control unit (3) including
instruction decoding means (16) for decoding
the instruction whose word length is constant
and smaller than a maximum number of bits of
unit data that can be handled by said
execution unit (2),

wherein at least said microprocessor is in
a single semiconductor integrated circuit
format.

11. A microprocessor for executing an instruction
in a program memory (45) and processing data
in a data memory (44), comprising

an execution unit (2), and

an instruction control unit (3) including
instruction decoding means (16) for decoding
the instruction whose word length is constant
and smaller than a maximum number of bits of
the data that can be concurrently transferred to
and from the data memory (44) and means for
controlling said execution unit according to the
instruction,

wherein at least said microprocessor is in
a single semiconductor integrated circuit
format.

12. The microprocessor of claim 10 or 11, further
including

an instruction prefetch queue means (15)
for accumulating the instruction to be given
to said instruction decoding means (16), and

means (25) for sorting out each instruction
from an aggregate of one instruction and a part
of another instruction contained in information
that is entered parallelly in a number of bits
greater than a number of bits of the instruction,
and means (27) for distributing the instruction
sorted by said means to said instruction prefetch
queue (15).

13. A microprocessor for executing a fixed length 
instruction in a program memory (45) and pro- 
cessing data in a data memory (44), compris- 
ing 

an execution unit (2), 

an instruction control unit (3) including in- 
struction decoding means (16) for decoding 
the fixed length instruction and means for con- 
trolling said execution unit (2) according to the 
instruction, and 

a bus interface unit (4) connected both to 
said execution unit (2) and said instruction 
control unit (3) and being adapted to make a 
number m of bits of the fixed length instruction 
supplied to said instruction control unit (3) 
smaller than a number n of bits of the data 
concurrently transferred to and from said execution
unit (2), said bus interface unit (4)
including an interface path (20) used com- 
monly for input and output of the data and for 
input of the instruction, said interface path (20) 
having a width to parallelly transfer a number j 
of bits of the data to and from the data mem- 
ory (44), 

wherein at least said microprocessor is
in a single semiconductor integrated circuit 
format. 

14. The microprocessor of claim 13, wherein m =
8k, n = 8l, j = 8i, and k < l, with k, l and i being
positive integers.

15. The microprocessor of claim 14, wherein k =
3, i = 4, and l = 4.

16. A processor particularly adapted for RISC
processing of instructions having a first preselected
word length and data units having a second
preselected word length, the processor
including a memory (44, 45), an execution unit
(2), an instruction control unit (3) and a bus
interface unit (4), all in communicative
association, comprising

means (26) for latching an aggregate of
the instructions in the bus interface unit (4)
wherein the aggregate is sized greater than the
first preselected word length,

a storage queue means (15) for storing a
plurality of the instructions in the instruction
control unit (3), and

means (27) for selectively aligning the
aggregate for storage in the queue means and
transferring the instructions from the means for
latching to the storage queue means in a word
length less than the second predetermined
word length.

17. The processor of claim 16, wherein the means
for selectively aligning comprises an input
register sized for storing a word of the second
predetermined word length, and an output
register sized for storing a word of the first
predetermined word length.

18. The processor of claim 17, wherein the means
for selectively aligning comprises means for
selectively shifting the word of the input
register for alignment of a portion of the word with
the output register.

19. The processor of any of claims 16 to 18,
wherein the memory comprises a data memory
(44) storing the data units and a program
memory (45) storing the instructions.

20. The processor of claim 19, wherein the data
memory (44) and the program memory (45)
are disposed for sharing in the same address
space.

21. The processor of any of claims 16 to 20,
wherein the bus interface unit (4) has an
interface path sized for input of at least a word of
the second preselected word length.

22. A method of processing data units with a
RISC-type instruction set wherein each
instruction is defined as having a first preselected
word length and the data units have a
second preselected word length, larger than
the first preselected word length, comprising

forming a sequence of the instructions in a
plurality of words of the second preselected
word length,

latching each of the plurality of words
into an aligner,

selectively transferring a portion of
each from the aligner to a queue to form
another sequence of the instructions in another
plurality of words of the first predetermined word
length, and

concurrently decoding the other plurality of
words and processing data units of words of
the second predetermined word length in
accordance with the decoded instructions, whereby
improved efficiency of processing is
achieved by the transferring and decoding of
the shorter instruction word lengths with the
processing of the longer data unit word
lengths.

23. The method of claim 22, including storing the
data units and instructions in a shared memory.